1.0 Introduction

B is a computer language directly descendant from BCPL [1,2]. A tutorial introduction to B on the H6070 
computer, under MH-TSS, is given in [3]. The reader is urged to read the examples in section 9 in parallel 
with the language description, to get the flavor of actual B applications. 

B was designed and implemented by D. M. Ritchie and K. L. Thompson. The B runtime support for
MH-TSS was designed and implemented by S. C. Johnson.

2.0 Syntax 

The next several sections describe the syntax of B. Throughout, reserved words and required characters are 
underlined, while names not underlined represent syntactic classes. Thus, the if statement is syntactically 
described by: 

if ( rvalue ) statement1 else statement2 ;

2.1 Syntactic Overview 

B is like most other programming languages in that it has programs made up of statements; many of the
statements contain expressions, and the expressions are made up of operators, names, and constants, with
parentheses used in the usual way to alter the order of evaluation.

B differs from most languages in that it has no types: no type declarations are needed, and no type
conversions are performed. A variable takes on a type (integer, function, vector, etc.) from its context in an
expression or statement. Another difference is the rich operator set of B, including logical and shift
operators. B is recursive, but pays only a small penalty for being so.

2.2 Comments and Character Sets 

Comments are delimited as in PL/I by /* and */. 

B programs consist of tokens which are names, constants, and operators. Normally tokens are separated by 
blanks, tabs, new-lines, or comments; in addition, the compiler treats the characters (){}[],;?: and 
maximal sequences of the characters +-*/<>&|!= as complete tokens.

The character set used in B is ascii. 

The characters A through Z, a through z, _, ., and backspace are alphabetic characters and may be used in 
names. The characters 0 through 9 are digits and may be used in constants or names; however, a name 
may not begin with a digit. 

There are certain members of the character set which are acceptable in string and character constants, but 
sometimes hard to represent. In these contexts, escape sequences are provided. The following escape 
sequences are currently defined: 

    *0    null
    *e    end-of-file
    *(    {
    *)    }
    *t    tab
    **    *
    *'    '
    *"    "
    *n    new line

All keywords in the language are recognized only in lower case. Keywords are reserved. 

3.0 Rvalues and Lvalues 

An rvalue is a binary bit pattern of a fixed length. On the H6070 it is 36 bits. Rvalues may have different
meanings: integers, labels, vectors, functions, etc. The actual kind of thing represented is called the type
of the rvalue.

A B expression can be evaluated to yield an rvalue, but its type is undefined until the rvalue is used in some 
context. It is then assumed to represent an object of the required type. For example, in the following 
expression 

a(b/c+d[3])

a is taken to be of type function, since it is followed by a left parenthesis; b and c are taken to be signed
integers; and d is taken to be of type integer vector, since it is followed by a left bracket.

There is no check to insure that there are no type mismatches. Similarly, there are no type conversions. 

An lvalue is a binary number ("address") representing a storage location; thus, for every lvalue there is an
associated rvalue, the contents of the lvalue. A variable in B (as in most other computer languages) has 
both an lvalue (its address) and an rvalue (its contents). Writing: 

x = y 

causes the contents (rvalue) of y to be stored into the location (lvalue) of x. The Rvalue is on the Right side 
of the assignment, the Lvalue is on the Left. 

Some things (such as constants, expressions, and function calls) are not permitted to have lvalues in B; 
thus: 


3 = x 
a+b = x 
f (3) = x 

are all illegal. There are unary operators, * and &, for converting between rvalues and lvalues. Unary *, 
the indirection operator, treats an rvalue as an lvalue; on the H6070, the rightmost 18 bits of the rvalue are 
taken as a machine address. Thus 

*3 = x 

is legal, and stores the contents of x into memory location 3. 

Unary &, the address operator, does the inverse operation: if x has an lvalue, &x is the rvalue which has the 
lvalue of x as the rightmost 18 bits (with the other bits 0). Thus, &x can be thought of as a pointer to x. If 
x has an lvalue, *(&x) is the same as x; this is just the same as saying that x can be used by indirecting 
through a pointer to x. The opposite identity, that &(*x) is the same as x, is always formally true but less 
useful in practice. 

Since addresses are stored in the right half of a word, adding 1 to a pointer gives a pointer to the next word 
in memory. For example, the contents of the word immediately following variable x can be obtained by 
writing *(&x+1).
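
The relation between lvalues and rvalues described in this section can be sketched in C by treating memory as an array of words. This is an illustration only and not part of B: the names word, mem, load and store are hypothetical, and a 64-bit integer stands in for the 36-bit H6070 word. As the text says, only the rightmost 18 bits of a value are used as the address.

```c
#include <stdint.h>

typedef int64_t word;            /* stands in for one 36-bit H6070 word */

#define ADDR_MASK 0777777        /* rightmost 18 bits hold the address  */

static word mem[ADDR_MASK + 1];  /* hypothetical word-addressed memory  */

/* Unary * used for its rvalue: fetch the contents of the addressed word. */
word load(word a)
{
    return mem[a & ADDR_MASK];
}

/* Unary * on the left of an assignment: store v into the addressed word. */
void store(word a, word v)
{
    mem[a & ADDR_MASK] = v;
}
```

In this model, store(3, x) plays the role of *3 = x, and if x lives at address 100 then load(100 + 1) plays the role of *(&x+1).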

4.0 Expression Evaluation

The evaluation of expressions (in particular, the binding of the operators) is done in the same order as the
sub-sections of this section. Thus expressions referred to as operands of '+' (section 4.4) are expressions
defined in sections 4.1 to 4.3. The binding of operators at the same level (left to right, right to left) is
specified in each sub-section.

This section discusses the binding and meaning of expressions; the order of evaluation of expressions is 
undefined. Thus, writing:

f(2) + g(3) 

one cannot assume that f will be called before g. 

4.1 Primary Expressions 

1. A name is an lvalue of one of three storage classes (automatic, external and internal). 

2. A decimal constant is an rvalue. It consists of a digit between 1 and 9 followed by any number of 
digits between 0 and 9. The value of the constant should not exceed the maximum value that can be 
stored in a word. 

3. An octal constant is the same as a decimal constant except that it begins with a zero. It is then
interpreted in base 8.

4. A character constant is represented by ' followed by one or more characters (possibly escaped)
followed by another '. It has an rvalue equal to the value of the characters packed and right adjusted,
with zero fill. The number of characters in a character constant is a machine dependent
quantity; on the H6070, up to four characters are allowed.

5. A string is any number of characters between " characters. The characters are packed into adjacent
words (lvalues sequential) and terminated with the character '*e' (ascii EOT character). The
rvalue of the string is the lvalue of the word containing the first character. See section 8.0 for library
functions used to manipulate strings in a machine independent fashion. Escapes are available to
allow special characters, such as tab and newline, to be used in string and character constants (See
section 2.2).

6. Any expression in () parentheses is a primary expression. Parentheses are used to alter order of
binding.

7. A vector reference v[n] is a primary expression v followed by any expression n in [ ] brackets. The
two expressions are evaluated to rvalues, added, and the result is used as an lvalue. v can be thought
of as a pointer to the base of a vector, while n can be thought of as the offset in the vector. Thus, if x
is a vector, x points to a block of locations whose names are x[0], x[1], ... . x[0] is the same as *x,
x[1] is the same as *(x+1), and so on. A string is a special case of a vector, where the contents of the
words x[0], x[1], ... contain packed characters.

8. A function call f(a,b, ... ) is a primary expression f followed by any number of expressions in ()
parentheses separated by commas. a, b, ... are evaluated (in an unspecified order) to rvalues and
assigned to the function's parameters. f is evaluated to an rvalue (assumed to be type function). The
function is then called. Each call is recursive at little cost in time or space. Thus f(), f(2,x+y,g(1)),
and ff[2](e33) are all function references: in the last example, ff is assumed to be a vector containing
functions.

Primary expressions are bound left to right. 
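
The packing of a character constant (item 4 above) can be sketched in C as follows. This is a hedged illustration, not the compiler's code: the name char_const is hypothetical, a 64-bit integer stands in for the 36-bit word, and the nine-bit character width is inferred from four characters fitting in a 36-bit word.

```c
#include <stdint.h>

typedef uint64_t word;   /* stands in for one 36-bit H6070 word */

/* Pack up to four characters as a character constant is described:
   right adjusted, zero filled, nine bits per character (assumed width). */
word char_const(const char *s)
{
    word w = 0;
    int i;

    for (i = 0; i < 4 && s[i] != '\0'; i++)
        w = (w << 9) | (word)(unsigned char)s[i];
    return w;
}
```

Under these assumptions 'a' has the value 0141 and 'ab' has the value 0141 shifted left nine bits, plus 0142; the unused high-order positions are zero.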

4.2 Unary Operators 

The unary operators are *, &, -, !, ++, --, and ~.

1. The rvalue (or indirection) prefix unary operator * is described in section 3.0.

2. The lvalue (or address) prefix unary operator & is also described in section 3.0.

3. The operand of the negation prefix unary operator - is interpreted as an integer. The result is an
rvalue with opposite sign.

4. The NOT prefix unary operator ! takes an integer operand. The result is zero if the operand is
non-zero. The result is one if the operand is zero.

5. The increment ++ and decrement -- unary operators may be used either in prefix or postfix form.
Either form requires an lvalue operand. The contents (rvalue) stored in the lvalue is either
incremented or decremented by one. The prefix operators return the newly altered value; the postfix
forms return the value before it is incremented or decremented. Thus if x currently contains 5, then
++x and x++ both change x to 6. The value of ++x is 6 while x++ is 5. Similarly, --x and x-- change
x to 4. The former has value 4, the latter 5.

6. The ones complement operator ~, when applied to an rvalue, turns all 1 bits to 0 and all 0 bits to 1.

Unary operators are bound right to left. Thus -!x++ is bound -(!(x++)).

4.3 Multiplicative Operators 

The multiplicative binary operators *, /, and %, expect rvalue integer operands. The result is also an
integer.

1. The operator * denotes multiplication.

2. The operator / denotes division. The result is correct if the first operand is divisible by the second.
If both operands are positive, the result is truncated toward zero. Otherwise the rounding is
undefined, but never greater than one.

3. The operator % denotes modulo. If both operands are positive, the result is correct. It is undefined 
otherwise. 

The multiplicative operators bind left to right. 

4.4 Additive Operators 

The binary operators + and - are add and subtract. The additive operators bind left to right. 

4.5 Shift Operators

The binary operators << and >> are left and right shift respectively. The left rvalue operand is taken as a
bit pattern. The right operand is taken as an integer shift count. The result is the bit pattern shifted by the
shift count. Vacated bits are filled with zeros. Thus, 8<<2 gives 32, while 8>>2 gives 2. The result is
undefined if the shift count is negative or larger than 36. The shift operators bind left to right.
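
As a rough model of these shifts on a 36-bit word, the following C sketch masks the result to 36 bits so that bits shifted off the left end are lost and vacated bits are zero. The names shl36, shr36 and WORD_MASK are hypothetical; callers are assumed to keep the count between 0 and 36, since the text leaves other counts undefined.

```c
#include <stdint.h>

#define WORD_MASK 0777777777777ULL   /* 36 one bits */

/* Left shift within a 36-bit word; bits shifted past the top are lost. */
uint64_t shl36(uint64_t w, unsigned n)
{
    return (w << n) & WORD_MASK;
}

/* Right shift within a 36-bit word; vacated high bits are zero. */
uint64_t shr36(uint64_t w, unsigned n)
{
    return (w & WORD_MASK) >> n;
}
```

With these definitions shl36(8, 2) is 32 and shr36(8, 2) is 2, matching the example in the text.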

4.6 Relational Operators 

The relational operators < (less than), <= (less than or equal to), > (greater than), and >= (greater than or 
equal to) take integer rvalue operands. The result is 1 if the operands are in the given relation to one 
another, and 0 otherwise.

4.7 Equality Operators 

The equality operators == (equal to) and != (not equal to) perform similarly to the relational operators.

4.8 AND Operator 

The AND operator & takes bit patterns as operands. The result is the bit pattern that is the bit-wise AND of 
the operands. The AND operator binds left to right. 

4.9 Exclusive OR Operator 

The Exclusive OR operator ^ performs exactly as AND, but the result is the bit-wise exclusive OR of the
operands. The Exclusive OR operator binds left to right.

4.10 OR Operator

The OR operator | performs exactly as AND, but the result is the bit-wise inclusive OR of the operands. 
The OR operator also binds left to right. 

4.11 Conditional Expression 

Three rvalue expressions separated by ? and : form a conditional expression. The first expression (to the 
left of the ?) is evaluated. If the result is non-zero, the second expression is evaluated and the third ignored. 
If the value is zero, the second expression is ignored and the third is evaluated. The result is either the 
evaluation of the second or third expression. Thus, a<b?a:b returns the value of a if a is strictly less than b, 
and b otherwise; i.e., it computes min(a,b). 

Binding is right to left. Thus a?b:c?d:e is a?b:(c?d:e). 

4.12 Assignment Operators 

There are 17 assignment operators in B. All require an lvalue on the left and an rvalue on the right. The
simple assignment operator = merely evaluates the rvalue and stores the result in the lvalue. The
assignment operators =*, =/, =%, =+, =-, =<<, =>>, =<, =<=, =>, =>=, ===, =!=, =&, =^, and =| perform
a binary operation (See sections 4.3 to 4.10) between the rvalue stored in the assignment's lvalue and the
assignment's rvalue. The result is then stored in the lvalue. Thus the expression x=*10 is identical to
x=x*10. Note that this is not x= *10, which stores the contents of location 10 in x.
The result of an assignment is the rvalue for simple assignment, and the result of the binary operation for
the other assignment operators. Assignments bind right to left; thus x=y=0 assigns zero to y, then x, and
returns the rvalue zero; similarly, if y contains the rvalue 2, then x = y =+ 3 sets both x and y to 5.
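
For readers who know C, the sketch below restates two of these operators. It is an illustration in C, not B: C writes the operator after the left operand (+= rather than =+), and C has no compound form of the relational assignments, so =< is written out in full. The names chain, less_assign and struct pair are hypothetical.

```c
struct pair { int x, y; };

/* B: x = y =+ 3;   binds right to left, so y is updated first. */
struct pair chain(int y)
{
    struct pair p;

    p.y = y;
    p.x = (p.y += 3);
    return p;
}

/* B: x =< y;   stores 1 in x if x was less than y, and 0 otherwise. */
int less_assign(int x, int y)
{
    x = x < y;
    return x;
}
```

Calling chain(2) leaves both members equal to 5, as in the example in the text.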

5.0 Statements 

Statements define program execution. Each statement is executed in sequence. There are, of course,
statements to conditionally or unconditionally alter normal sequencing.

Most statements end with a semicolon ;. 

5.1 Compound Statement 

A sequence of statements in { } braces is syntactically a single statement. This mechanism is provided so 
that where a single statement is expected, any number of statements can be placed. 

5.2 Conditional Statement 

A conditional statement has the general form: 

if ( rvalue ) statement1 else statement2 ;

This evaluates rvalue and executes statement1 if the rvalue is non-zero, and statement2 if the rvalue is zero.
Notice the analogy with conditional expressions, section 4.11.

The "else" clause is optional; thus: 

if ( rvalue ) statement ; 

executes statement if the rvalue is nonzero, and skips it if the rvalue is zero. 

As an example, the program segment: 

if ( a<b ) x=a; 
else x=b; 

sets x to the minimum of a and b. 

5.3 While Statement 

The while statement has the form: 

while ( rvalue ) statement ;

The rvalue is evaluated; if zero, statement is skipped and control passes to the following statement in the
program. If non-zero, statement is executed. The rvalue is then evaluated again and the process is
repeated until the rvalue becomes zero.

The while statement can be used to write tight loops quite tersely. Thus, to call the function f 10 times, we 
may write: 

i = 10;
while( i-- ) f();

We may also write searches:

while( v[++i] ) ;

sets i to the index of the first zero element in the vector v.
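
The search loop reads the same way in C, so a small C sketch can show where it starts looking. The function name first_zero_after is hypothetical. Because i is incremented before the test, the first element examined is the one after the starting index; to search from v[0], i must start at -1. The caller must guarantee that a zero element exists.

```c
/* C rendering of the B loop:  while( v[++i] ) ;
   Returns the final value of i. */
int first_zero_after(const int *v, int i)
{
    while (v[++i])
        ;
    return i;
}
```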

When statement is a compound statement, it can be broken out of by a break statement (see 5.6). 

5.4 Switch Statement 

The switch statement is the most complicated statement in B. The switch has the form: 

switch ( rvalue ) statement1

Virtually always, statement1 above is a compound statement. Each statement in statement1 may be
preceded by one or more cases as follows:

case constant : 

During execution, the rvalue is evaluated and compared to each case constant in undefined order. If a case 
constant is equal to the evaluated rvalue, control is passed to the statement following the case. Thus, the 
code fragment: 

switch (x) {
case 'a':
    y = 1;
case 'b':
    z = 2;
}

sets z to two if x has the value 'b', sets y to one and z to two if x has the value 'a', and does nothing
otherwise.

If the rvalue matches none of the case constants, statement1 is normally skipped; this can be changed,
however, by preceding a statement in statement1 by

default: 

Control then passes to the following statement when none of the case constants are matched. 

A compound statement following a switch block can be broken out of by a break statement (see 5.6). 
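
The fall-through behavior of the case 'a' / case 'b' fragment earlier in this section carries over unchanged to C, so it can be checked with a C sketch. The names demo_switch and struct yz are hypothetical; y and z start at zero so that an untouched variable is visible in the result.

```c
struct yz { int y, z; };

/* Same shape as the B fragment: case 'a' falls through into case 'b'. */
struct yz demo_switch(int x)
{
    struct yz r = {0, 0};

    switch (x) {
    case 'a':
        r.y = 1;
        /* fall through */
    case 'b':
        r.z = 2;
    }
    return r;
}
```

Placing a break after the assignment to y would make the two cases independent.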

5.5 Goto Statement 

The goto statement is as follows: 

goto rvalue ; 

The rvalue is expected to be of type label. Control is then passed to the corresponding label. Transfers 
into and out of compound statements are legal, but use of labels as dummy arguments to transfer between 
functions or function invocations is almost certain to cause disaster. 

5.6 Break Statement 

The break statement has the syntax: 

break ; 

It is used to break out of a compound statement controlled by a while or switch statement; other compound 
statements are ignored by it. If a break statement is within several compound statements controlled by 
while or switch statements, the innermost compound statement is the one broken out of. Thus, in the
program fragment:

    while( i-- ) {
        j = f(i);
        switch( j ) {
        case 1:
            x = 5;
            break;
        case 2:
        case 3:
            x = g(j);
        }
    labl:
        if( j < 0 ) {
            xy = 3;
            break;
        }
        xy = j/2;
    }
    more:

the first break statement is equivalent to "goto labl;", and the second is equivalent to "goto more;". 

5.7 Return Statement 

The return statement is used in a function to return control to the caller of the function. The first form
simply returns control.

return ;

The second form returns an rvalue for the execution of the function.

return ( rvalue ) ;

The caller of the function need not use the returned rvalue. 

A return statement is automatically generated before the closing } of a function definition (See section 7.3). 

5.8 Rvalue Statement 

Any rvalue followed by a semicolon is a statement. The two most common rvalue statements are
assignment and function call.

5.9 Null Statement 

A semicolon is a null statement causing no execution. It is used mainly to carry a label after the last
executable statement in a compound statement. It sometimes is used to supply a null body to a while
statement (See the second example in 5.3).

6.0 Declarations 

Declarations in B specify storage class of variables, and also, in some circumstances, specify initialization. 
There are three storage classes in B. Automatic storage is allocated at each function invocation, and 
becomes undefined upon return from the function. External storage is allocated before execution of the 
program, and is available to any and all functions. Internal storage is also allocated before execution, but is 
available to only one function; labels are the only current use of internal storage. Automatic and internal 
declarations result in immediate allocation of storage for the variable; an external declaration allocates no 
storage. Storage is allocated for external variables by means of external definitions (See section 7). 

External and automatic declarations should appear before the first use of the variables declared; the
declaration remains in force until the end of the function in which it appears. Internal variables may be used

6.1 External Declaration

The external declaration has the form: 

extrn name1 , name2 ... ;

The external declaration specifies that each of the named variables is of the external storage class. Each of 
the variables must also be externally defined. 

If the first use of a name is immediately followed by a left parenthesis '(', the name is typed external by
default; thus the library functions need not normally be declared. 

6.2 Automatic Declaration 

The automatic declaration also constitutes a definition: 

auto def1 , def2 ... ;

where each def has either the form

name

or the form

name [ constant ]

The automatic declaration defines the variable to be of class automatic, and allocates storage for the
variable. When a name is followed by a constant in brackets, the automatic variable is initialized to the base of
an automatic vector with size equal to constant+1. The actual subscripts used to reference the vector range 
from zero to the value of the constant. 

6.3 Internal Declaration 

The first reference to a variable not declared as external or automatic constitutes an internal declaration. 
The major use of internal declarations is with labels; at the end of each program, internal names not defined 
as labels will cause an error message. A label is defined by writing 

name :

preceding any statement. 

7.0 External Definitions 

A complete B program consists of a series of external definitions. Execution is started by the hidden 
sequence 

main (); exit(); 

Thus, it is expected that one of the external definitions is a function definition of main. (Exit is a
predefined library function. See section 8.0.)

7.1 Simple Definition 

The simple external definition allocates storage for an external object and optionally initializes it; the forms 
are: 


name ;

or

name ival ;

In the first form, the external name is defined and initialized with zero. In the second form, an ival (initial 
value) is a constant or a name; the external name is defined and initialized with the value of the constant, or 
the lvalue of the name, respectively. 

7.2 Vector Definitions

An external vector definition has one of the following forms: 

name [ ] ; 

name [ constant ] ; 

name [ ] ival , ival ... ;

name [ constant ] ival , ival ... ;

The name is initialized with the lvalue of the base of an external vector. If the vector size is missing, zero is 
assumed. In either case, the vector is initialized with the list of ivals (See 7.1). The actual size of the vector 
is the maximum of constant+1 and the number of initial values. Any vector elements which are not
explicitly initialized have undefined values.
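
The size rule can be written as a one-line calculation. The C sketch below is only an illustration of the arithmetic; the function name extern_vector_words is hypothetical, and an empty pair of brackets is represented by a constant of zero, as the text specifies.

```c
/* Number of words in an external vector, following the rule in the text.
   constant:  the bracketed size, or 0 when the brackets are empty
   ninit:     how many initial values are listed */
int extern_vector_words(int constant, int ninit)
{
    int declared = constant + 1;

    return declared > ninit ? declared : ninit;
}
```

So a definition with empty brackets and no initial values still occupies one word, and a definition with constant 2 and ten initial values occupies ten words.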

7.3 Function Definitions 

Function definitions have the following form: 

name ( arguments ) statement

The name is initialized to the rvalue of the function. The arguments consist of a list of names separated by 
commas. Each name is defined as an automatic variable; the statement (usually compound) defines the 
execution of the function. When the function is invoked, each dummy argument is initialized to the value 
of the corresponding actual argument in the call; there are no side effects on the actual arguments in the 
function invocation. This form of argument passing is frequently called "call by value". 

For example, a function which returns the minimum of its two arguments can be defined by: 

min(a,b) { 

if ( a<b ) return (a); 
else return (b); 

} 


8.0 Runtime Support for B in MH-TSS 
8.1 Input/Output Routines 

By far the largest class of routines in the runtime library for B are the I/O routines. This is largely because 
of the nonuniform way in which I/O is done in MH-TSS, and also because of a desire to give maximum 
facility with a minimum cost in core utilization. 

The underlying I/O for B is character oriented, with two routines, putchar and getchar, forming the basis 
for the system. By default, getchar picks up characters from the teletype, and putchar writes characters to 
the teletype. The I/O facilities also permit the use of I/O units, in a way similar to Fortran. The default 
attachment of putchar and getchar to the teletype can be easily changed to allow the reading and writing of 
one or more ascii disc files concurrently. I/O buffer management is handled by the system. A unique
feature of the runtime system is the existence of an I/O unit which has the effect of sending command lines to
the system; i.e., lines written to this unit behave as if they had been written at SYSTEM? level, and cause
immediate execution of the specified SYSTEM level commands. A detailed description of the I/O routines
follows.

getchar() 

— this routine returns the next character from the current input unit, right justified with zero fill. The
current input unit is given by the external variable rd.unit, whose default value is 0. Unit 0 is always defined
as the teletype input unit. If a unit u has never been defined, or is in end-of-file status, getchar will return
the ascii EOT character, denoted '*e'.

putchar(c) 

— this routine puts the four ascii characters in c onto the current output unit. The null character (ascii code 
000) and the EOT character (ascii code 004) are deleted whenever encountered, and are not copied to the 
output unit. Thus character constants may be output with expected results, since the characters which are 
not explicitly present are filled with nulls; putchar('ab') would result in the output of only two characters,
'a' and 'b', to the output unit.
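
The way putchar treats its argument can be sketched in C. This is a model under stated assumptions, not the library routine: the name unpack_word is hypothetical, a 64-bit integer stands in for the 36-bit word, characters are assumed to be nine bits wide, and they are assumed to be emitted from the leftmost position to the rightmost. Null and EOT are dropped, as the text describes.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the four nine-bit characters of w, leftmost first, into out,
   skipping null (000) and EOT (004). Returns how many were copied.
   out must have room for four characters. */
size_t unpack_word(uint64_t w, char *out)
{
    size_t n = 0;
    int shift;

    for (shift = 27; shift >= 0; shift -= 9) {
        unsigned c = (unsigned)((w >> shift) & 0777);

        if (c != 0 && c != 004)
            out[n++] = (char)c;
    }
    return n;
}
```

For a word holding 'ab' right adjusted, the two leading null positions are skipped and only 'a' and 'b' are produced.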

The user must be careful to indicate explicitly the newline character at the end of each line of output; in 
character constants and strings this has the escape sequence ’*n’. If this is not done, all of the output comes 
in one long line or record, resulting in probable chaos! 

The current output unit is given by the external variable wr.unit, whose default value is 1. Unit 1 is always 
defined as writing the teletype. Another predefined unit is unit -1, which is always defined as the
system unit; a line written onto unit -1 acts as if it was typed at "SYSTEM?" level in MH-TSS. This gives
users a simple and flexible way of using many MH-TSS features without requiring a lot of resident code, 
or fancy bookkeeping. 

In general, a user can switch units by simply assigning an integer from -1 through 10 to either rd.unit or 
wr.unit. The smart user will make a copy of the current unit setting and restore it after his function is
completed; this increases the generality of his program.

putchar(c) returns c as its value; thus the statement: 

while( putchar(getchar()) != '*n' ) ;

copies a line from the input unit to the output unit. 

We have discussed the default settings for units 0, 1, and -1. In order to read and write on ascii disc files, it 
is necessary to open (i.e., access) the file, and associate the file with a unit number. Thus, we have the two
functions "openr" and "openw". 

openr(u,s) 

— the string s is taken as a cat/file description, and the file is rewound and opened as a sequential ascii disc 
file with read permission. An empty string s is taken to mean the teletype. The file will now be known as 
unit number u; the external variable rd.unit is set to u, so that the next call to getchar, getstr, etc. will cause
input from this file. If u is not between 2 and 10, openr does nothing. If unit u is open when this function 
is called, it is closed before the access (See close, below). For error handling, see the end of this section. 

openw(u,s) 

— the string s is taken as a cat/file description and the file is rewound and opened as a sequential ascii disc 
file with read and write permission. The null string s is taken to mean the teletype. The file will now be 
known as unit number u; the external variable wr.unit is set to u, so that the next call to putchar, putstr, 
printf, etc. will cause output onto this file. If u is not between 2 and 10, openw does nothing. If unit u is 
open when this function is called, it is closed before the access (See close, below). For error handling, see 
the end of this section. 

The next three I/O functions are string, rather than character, oriented. They are implemented by repeated 
calls to getchar and putchar to do the I/O, followed by packing or unpacking the strings. 

getstr(a) 

— This function reads the next input line into the vector a. The newline character at the end of the input line 
is replaced by the string delimiter '*e'. The user is responsible for seeing that the vector a is big enough to
hold the input string in all cases; this must include leaving room for the string delimiter. This function 
returns a as its value. 

putstr(a) 

— This function copies the string a to the current output unit by repeated calls to putchar. The characters 
are output successively up to, but not including, the string delimiter '*e'. Thus,

putstr(getstr(a));
putchar('*n');

copies a complete line from the input to the output unit. 

system(a) 

— This function copies the string a to the system output unit, unit -1. This has the same effect as if the 
string a had been typed at "SYSTEM?" level. A newline is not needed at the end of the string a, since the 
system function adds one.

The next three I/O functions do important nonstandard operations. 

close(u) 

— the unit number u is endfiled, and, if u was an ascii disc file, the space for its buffers is reclaimed. Files 
assigned to the teletype have the current line (if nonempty) written out onto the teletype with no following 
newline; thus this function can be used to force a prompting question before reading a response on the 
same line. 

Upon normal termination, each output unit is closed. In some cases of abnormal termination, control is 
lost and no wrapup is possible; in this case, the units may not be closed. 

flush() 

— the current output file is closed. This is the same as close(wr.unit). 

reread() 

— this function causes the current line to be reread. It is currently undefined for ascii input files; on the
teletype, it means that the next character to be read will be the first character of the current line. There is an
important special meaning of reread: if reread is called before the first call to getchar, the first line read will 
be the "command"; i.e., the line which was typed to initiate execution. 

The highest level output function is printf; it is also one of the most useful. 

printf(fmt,a1,a2,a3,...,a10)

— This function writes the arguments a1, ..., a10 on the current output unit under the control of the string
fmt. From 0 to 10 of the arguments a may be present. The format conversion is controlled by two-letter 
sequences of the form ’%x’ inside the string fmt, where x stands for one of the following: 

c — character data (in ascii) 
d — a decimal number 
o — an octal number 
s — a character string 

The characters in the string fmt which do not appear in one of these two-character sequences are copied 
without change to the output unit. 

Thus, the call: 

printf("%d + %o is %s or %c*n", 1,-1,"zero",'0'); 

causes the output line: 

1 + 777777777777 is zero or 0 
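
The twelve sevens in this output follow from printing -1 as an unsigned 36-bit pattern in octal. The C sketch below reproduces the digits under the assumption, consistent with the output shown, that the word is 36 bits wide and negative integers are held in two's complement form. The names octal36 and WORD_MASK are hypothetical.

```c
#include <stdio.h>

#define WORD_MASK 0777777777777ULL   /* 36 one bits */

/* Write v into buf as the %o conversion would show it on a 36-bit
   two's complement word. buf should hold at least 13 bytes. */
void octal36(long long v, char *buf, size_t len)
{
    snprintf(buf, len, "%llo", (unsigned long long)v & WORD_MASK);
}
```

Each octal digit covers three bits, so a word of 36 one bits prints as twelve 7 digits.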

As another example, a permanent file whose name is in the string s can be created with size n blocks and 
general read permission by using printf together with the system output unit and the filsys subsystem: 

wr.unit = -1; 

printf ("filsys cf %s,b/%d/,r*n", s, n); 

There are several additional points to be noted about the I/O system. 

The first is that if a file which is being written on is too small, it is automatically grown. 

The second point involves the treatment of cat/file descriptions. There are two types of cat/file
descriptions: complete descriptions and quick access descriptions. A complete description is a string containing at
least one "/" character; a quick access description contains no "/" character.

A complete description is viewed as accessing a permanent file in the file system. Thus, if there is a file of 
the same name already accessed (in the AFT) the previous file is deaccessed and the access repeated. If the 
file being accessed does not exist, the access fails. 

A quick access description is viewed as accessing a file which may or may not be permanent and may or 
may not be already accessed. Thus, if a file of the given name is already accessed, it is used. If not, the 
user’s catalog is searched for a file of that name. If the file is not found, a temporary file of that name is 
created, provided that the file is being opened for output.

Currently, cat/file descriptions may not contain permissions, passwords, subcatalogs, or alternate names. 

Finally, there are two modes of error handling in the I/O system. Under the default mode, failure to access
a file will cause an error message and termination of the job. Under the anything goes mode, access
failures are not directly reported to the user, and execution continues. In this case, the functions openr and
openw return 0 if the access was successful, and nonzero otherwise. In all cases, failure to grow an output
file causes termination of execution.

An attempt to read a character from a nonexistent file causes character ’*e’ to be returned. Writing on a 
nonexistent file causes output to disappear without a trace. The error handling mode may be changed by 
calling the routine ioerrors: 

ioerrors ( n ); 

will set the mode to the default if n is 0, and to "anything goes" if n is 1.

8.2 Useful Routines 

This section describes a number of useful functions for simple character and string manipulation, and
storage allocation.

char(s,n) 

— returns the value of the nth ascii character, in the string s, right justified with zero fill. Characters are 
numbered from left to right, starting at zero. Thus char("abc",1) returns 'b'.

lchar(s,n,c) 

— This function replaces the nth ascii character of string s by the rightmost nine bits of word c. c is 
returned as a value. Thus, if s has the value "abc", lchar(s,1,'x') returns the value 'x', and sets s to have the
value "axc".
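
The behavior of char and lchar can be modeled in C. This is a sketch under a stated assumption: each 36-bit word holds four nine-bit characters, leftmost character first. The names b_char and b_lchar are hypothetical, chosen to avoid clashing with C keywords.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: four 9-bit characters per 36-bit word, leftmost first. */
enum { CHARS_PER_WORD = 4, CHAR_BITS = 9, CHAR_MASK = 0777 };

/* Bit position of character n within its word. */
static int shift_for(int n)
{
    return CHAR_BITS * (CHARS_PER_WORD - 1 - n % CHARS_PER_WORD);
}

/* Model of char(s,n): the nth character, right justified, zero fill. */
uint64_t b_char(const uint64_t *s, int n)
{
    assert(s != NULL && n >= 0);
    return (s[n / CHARS_PER_WORD] >> shift_for(n)) & CHAR_MASK;
}

/* Model of lchar(s,n,c): store the rightmost nine bits of c as the
   nth character of s; c itself is returned. */
uint64_t b_lchar(uint64_t *s, int n, uint64_t c)
{
    assert(s != NULL && n >= 0);
    int sh = shift_for(n);
    uint64_t *w = &s[n / CHARS_PER_WORD];
    *w = (*w & ~((uint64_t)CHAR_MASK << sh)) | ((c & CHAR_MASK) << sh);
    return c;
}
```

With a word packed as 'a', 'b', 'c', '*e', b_char(s,1) yields 'b', and b_lchar(s,1,'x') leaves the word holding 'a', 'x', 'c', '*e'.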

charb(s,n) 
lcharb(s,n,c)

— These functions are the same as char and lchar, except that they operate with six bit characters. 

ascbcd(v,n,s)

— This function converts the first n ascii characters of the string s to bcd, and stores them packed into the
vector v. If the string s has fewer than n characters, trailing blanks are supplied to fill out v to n characters.

bcdasc(s,v,n)

— converts the first n bcd characters of the vector v into lower case ascii, and puts them into the string s,
followed by a '*e' character. Trailing blanks are deleted. The user is responsible for seeing that the vector
s has enough room to store the output string, which may be as long as n+1 ascii characters, including the 
string delimiter. This function returns s as a function value. 

concat(a, b1, b2, ..., b10)

— This function concatenates the strings b1 through b10 together (from 0 to 10 such arguments can be
present) and copies the resulting string into a. The user is responsible for seeing that the vector a is large
enough to hold the resulting string. a is returned as a function value.
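
As a rough C analogue of concat, the sketch below takes an explicit argument count, since C has no counterpart to nargs. The name concat_n and the count parameter are assumptions made for illustration; as in B, the caller must ensure that a is large enough.

```c
#include <assert.h>
#include <stdarg.h>
#include <string.h>

/* Concatenate count strings (0 to 10) into a and return a.
   count replaces the implicit argument count B obtains from nargs. */
char *concat_n(char *a, int count, ...)
{
    assert(a != NULL && count >= 0 && count <= 10);
    va_list ap;
    a[0] = '\0';
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        strcat(a, va_arg(ap, const char *));
    va_end(ap);
    return a;
}
```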

getarg(a,b,n) 

— A rudimentary scanning function is provided by this routine. getarg puts into string a the first argument
at or after character n in string b. It returns a new value of n which contains the index of the first character 
after the argument returned. An argument is defined as a string of characters separated by blanks or tabs. 

For example, the code: 

n = getarg(a1,b,0);
n = getarg(a2,b,n);
getarg(a3,b,n);

would set a1, a2, and a3 to the first three arguments of b. Thus, if b were the string:

"well done"

a1 would become "well", a2 would become "done", and a3 would contain the null string. b would be
unchanged.
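
The scanning rule can be sketched in C as follows. This is a model of the described behavior on NUL-terminated C strings, not the library source; it assumes that n never exceeds the length of b.

```c
#include <assert.h>
#include <string.h>

static int is_sep(char c)
{
    return c == ' ' || c == '\t';
}

/* Copy into a the first argument at or after b[n]; return the index
   of the first character after that argument. */
int getarg(char *a, const char *b, int n)
{
    assert(a != NULL && b != NULL && n >= 0);
    assert((size_t)n <= strlen(b));
    while (is_sep(b[n]))                    /* skip leading blanks and tabs */
        n++;
    int k = 0;
    while (b[n] != '\0' && !is_sep(b[n]))   /* copy up to next separator */
        a[k++] = b[n++];
    a[k] = '\0';
    return n;
}
```

Applied three times to "well done" starting at 0, this yields "well", "done", and the null string, as in the example above.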

8.3 Other Functions

getvec(n) 

— this function returns a vector of size n (i.e. n+1 words) from the core hole. The memory size used by the 
job will grow if necessary to get the space. 

rlsevec(v,n) 

— this function releases the n+1 words of the static vector v back to the system. Calling this routine with an 
automatic vector v as argument results in immediate or eventual disaster! 

nargs() 

— returns the number of arguments with which the currently executing function was called. 

exit () 

— terminates the run gracefully, closing all open output files. A return from the main procedure simulates a 
call to exit. 

9.0 Examples 

The examples appear exactly as given to B. 


9.1

/* The following complete B program, if compiled and put on your
file "hstar", will act as an ascii file copy routine; the command
at "SYSTEM?" level:

/hstar file1 file2

will copy file1 to file2. */

main () {
   auto j,s[20],t[20];

   reread();               /* get command line */
   getstr(s);              /* put into s */
   j = getarg(t,s,0);      /* skip H* name */
   j = getarg(t,s,j);      /* file1 */
   openr( 5,t );
   getarg(t,s,j);          /* file2 */
   openw( 6,t );
   while( putchar( getchar() ) != '*e' ) ;   /* copy contents */

}

9.2

/* This function is called with a string s of the form nnn, nnn,
nnn, ... , where the nnn are integers. The values are placed
in successive locations in a vector v. The number of integers
converted is returned as a function value. This program
provides a simple illustration of the switch and case statements. */

convert(s,v) {
   auto m,i,j,c,sign;

   i = 0;     /* vector index */
   j = -1;    /* character index */

init:         /* initialize to convert an integer */
   m = 0;     /* the integer value */
   sign = 0;  /* sign = 1 if the integer is negative */

loop:         /* convert an integer */
   switch (c = char(s,++j)) {

   case '-':
      if(sign) goto syntax;
      sign = 1;

   case ' ':
      goto loop;

   case '*e':
   case ',':  /* delimiter . . . store converted value */
      v[i++] = sign?(-m):m;
      if( c == '*e' ) return(i);
      goto init;
   }

/* none of the above cases . . . if a digit, add to m */

   if ( '0' <= c & c <= '9' ){
      m = 10*m + c-'0';
      goto loop;
   }

/* syntax error . . . print message and return -1 */

syntax:
   printf("bad syntax*n");
   return(-1);
}
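
For readers more familiar with the successor language, the same conversion can be sketched in C. This is an illustrative rewrite, not part of the original examples: it uses loops in place of the labels and goto, works on a NUL-terminated string instead of one ending in '*e', and returns -1 without printing a message.

```c
#include <assert.h>
#include <stddef.h>

/* Store the integers of s ("nnn, nnn, ...") into v.
   Returns the number stored, or -1 on a syntax error. */
int convert(const char *s, int *v)
{
    assert(s != NULL && v != NULL);
    int i = 0;   /* vector index    */
    int j = 0;   /* character index */

    for (;;) {                       /* one pass per integer ("init") */
        int m = 0, sign = 0, done = 0;
        while (!done) {              /* one pass per character ("loop") */
            int c = (unsigned char)s[j++];
            switch (c) {
            case '-':
                if (sign) return -1; /* second minus sign: syntax error */
                sign = 1;
                /* fall through */
            case ' ':
                break;               /* blanks are ignored */
            case '\0':
            case ',':                /* delimiter: store converted value */
                v[i++] = sign ? -m : m;
                if (c == '\0') return i;
                done = 1;
                break;
            default:                 /* a digit, or a syntax error */
                if (c < '0' || c > '9') return -1;
                m = 10 * m + (c - '0');
            }
        }
    }
}
```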


9.3

/* This function replaces each upper case character in the input
string s by its lower case equivalent. It uses the fact that
the ascii alphabetic characters are contiguous. */

lower(s) {
   auto c,i;

   i = -1;
   while( (c=char(s,++i)) != '*e' )
      if( c >= 'A' & c <= 'Z' ) lchar(s,i,c-'A'+'a');
}


9.4 


/* This function prints out an unsympathetic error message on the 
terminal for each integer value of errno from 0 to 5 */ 

snide(errno) {
   extrn wr.unit, mess;
   auto u;   /* temporary storage for the unit number */

   u = wr.unit; wr.unit = 1;
   printf("error number %d, %s*n",errno,mess[errno]);
   wr.unit = u;
}

mess [5] "too bad", "tough luck", "sorry, Charlie", "that's the breaks", 
"what a shame", "some days you can't win"; 


10.0 Compiling and executing programs in MH-TSS 

B programs can be run in MH-TSS through the use of the command ./bj (for "B job"). This command,
written by B. W. Kernighan and S. C. Johnson, provides an interface between the B compiler and the
operating system. The syntax of the command is:

./bj [(options)] [source [hstar] ] [spec] [spec] [spec] . . . 

In the above command, brackets [] indicate optional arguments. Source is the cat/file description for the B 
source file, hstar is the cat/file description for the H* output file, and options stands for the jrun options. 
All cat/file descriptions in this command which do not contain a user master catalog name are taken to be 
part of the current user master catalog. The arguments denoted by spec above give other specifications for 
the running of the job; there are the following types of specification: 

h=cat/file 
r=cat/file 
l=cat/file 
s=nnn 

The construction "h=cat/file" is used to specify an H* file when there is no source file being compiled; i.e.,
the H* file is being entirely constructed from libraries. The construction "r=cat/file" specifies that the
random library cat/file should have the output of the B compilation placed on it by means of the RANEDIT
automatic edit facility. The construction "l=cat/file" specifies that the library cat/file should be searched by
the loading process as part of constructing the H* file. The "r=" construction also specifies that the
associated cat/file should be searched; i.e., it causes an implied "l=" construction. The libraries specified by "l="
and "r=" constructions are searched in the order in which they appear on the command line. The
specification "s=nnn" is used to specify an optional size for the stack in B; the default value is 500 words. Two
words of stack are used for each function invocation, together with one word for each argument, and space
for automatic variables and a few temporaries. In this construction, nnn is an integer.

The user should ideally possess an IDENT file, exactly as required by the QED utility programs such as 
./list, etc. If no IDENT file is found, a temporary IDENT file is created by asking the user for his ident 
card image and his userid. 

The "./bj" command works as follows: the first two passes, ./b1 and ./b2, of the B compiler are run in
MH-TSS. The result of the second pass of the compiler is a GMAP program, which must be sent to the batch
world, compiled, and loaded with the B I/O library (on file ./blib) to create an executable H* file. The ./bj
command calls ./b1 and ./b2, and if no errors are detected, the GMAP deck is submitted to the batch world;
currently, this is done using "jrun". The snumb of the job is printed out on the terminal. When the job is 
completed, it can be run by typing "/hstar" at system level, where hstar is the name of the H* file in the ./bj 
command line. 

Several examples follow: 

./bj name hstar 

This is the simplest command; the B program on file "name" is compiled, and the resulting H* file put on 
file hstar. 


./bj (w) name hstar 

This is the same as the first example, except that the GMAP compilation is submitted with "wait"
disposition by jrun.

./bj name hstar r=abc/blib 

This is the same as the first example, except that the object deck is saved on the library abc/blib. (Note that 
a RANEDIT job will have to be run to clear this library before any decks are put on it). 

./bj name hstar l=lib1 r=lib2 l=lib3

This is the same as the first example except that the object deck is placed on library lib2, and the libraries
lib1, lib2, and lib3 are searched, in that order, to create the file hstar.
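
The search-order rule ("l=" and "r=" libraries are searched in command-line order, with "r=" implying "l=") can be sketched in C. The function name search_order and its parameters are hypothetical; the sketch only extracts the order and does not model ./bj itself or the ./blib library.

```c
#include <assert.h>
#include <string.h>

/* Collect into order[] the library names named by "l=" and "r="
   specifications, in the order they appear. Returns how many were
   collected (at most max). Other specifications are skipped. */
int search_order(int nspec, const char *spec[], const char *order[], int max)
{
    assert(spec != NULL && order != NULL && max >= 0);
    int n = 0;
    for (int i = 0; i < nspec && n < max; i++) {
        if (strncmp(spec[i], "l=", 2) == 0 || strncmp(spec[i], "r=", 2) == 0)
            order[n++] = spec[i] + 2;   /* name follows the "x=" prefix */
    }
    return n;
}
```

For the specifications in the last example, the collected order is lib1, lib2, lib3.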

11.0 Advanced Topics 

This section describes the execution environment of B programs, and tells how to write B-callable GMAP 
subroutines. 

B programs are recursive; the stack pointer is kept in index register 7. Thus, during the execution of any
function, the return address is kept in address 0,7, and the values of the first, second, etc. arguments are
kept in addresses 1,7, 2,7, 3,7, etc. If there are n arguments, the nth has address n,7; the locations from
n+1,7 onward are used for automatic variables and temporaries.
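
The frame layout just described amounts to a few fixed offsets from index register 7. A minimal sketch in C, with hypothetical helper names and plain integers standing in for machine addresses:

```c
#include <assert.h>

enum { RETURN_SLOT = 0 };   /* 0,7 holds the return address */

/* Offset from index register 7 of argument k (counting from 1). */
int arg_offset(int k, int nargs)
{
    assert(1 <= k && k <= nargs);
    return k;
}

/* Offset of the first word used for automatics and temporaries. */
int first_auto_offset(int nargs)
{
    assert(nargs >= 0);
    return nargs + 1;
}

/* Stack address of a frame slot, given the value of register 7. */
int frame_address(int x7, int offset)
{
    assert(offset >= 0);
    return x7 + offset;
}
```

For a function of two arguments entered with register 7 equal to 41, the arguments sit at 42 and 43 and automatic storage begins at 44.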

When one B program calls another, index register 7 must be moved forward to an unused portion of the 
stack; by convention, this is done by the called program. A call to a routine "sub" causes the following 
GMAP code to be generated: 

       tsx1   sub,*
       zero   s,n

In this call, s is an integer telling how much index register 7 should be advanced to avoid conflicting with
the calling program's use of the stack. n contains the number of arguments; this is never used by the calling
sequence, but is available to the user by calling the library procedure nargs. Notice that the tsx1 is indirect
through the word sub, instead of directly to it. Functions and labels in B, because they point to executable 
code, contain their addresses in the left half of the 36 bit word; this means they can be indirected through. 
In contrast, pointers which point to data (such as vectors) have their addresses in the right half of the word. 
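
The two pointer conventions can be illustrated with 36-bit words held in 64-bit integers. The helper names below are hypothetical, and 18-bit addresses are assumed from the "left half" and "right half" wording.

```c
#include <assert.h>
#include <stdint.h>

#define HALF_MASK 0x3FFFFULL   /* 18 one-bits */

/* Function or label value: address in the left half of the word. */
uint64_t code_pointer(uint32_t addr)
{
    assert(addr <= HALF_MASK);
    return (uint64_t)addr << 18;
}

/* Data pointer (e.g. a vector): address in the right half. */
uint64_t data_pointer(uint32_t addr)
{
    assert(addr <= HALF_MASK);
    return addr;
}

uint32_t left_half(uint64_t w)  { return (uint32_t)((w >> 18) & HALF_MASK); }
uint32_t right_half(uint64_t w) { return (uint32_t)(w & HALF_MASK); }
```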

Upon entry to the routine sub, the stack frame is bumped and the return address is stored on the stack; the
first three instructions are typically:

sub    zero   *+1       pointer to code
       adx7   0,1       advance stack pointer
       stx1   0,7       store return address on stack

The return from this function is the reverse process:

       ldx1   0,7       restore return address
       sbx7   0,1       restore stack pointer
       tra    1,1       return



Because the return sequence is identical for all functions, and a function may have several returns, this 
sequence of code is stored in the runtime library with the external name .10000 . Thus, the above lines are 
replaced by: 

symref .10000 
tra .10000 
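
The entry and return sequences can be traced with a toy model in C. The struct and function names are hypothetical. One array stands in for memory, holding both the stack and the "zero s,n" word, of which only the s field is modeled.

```c
#include <assert.h>
#include <stddef.h>

struct machine {
    int x1;         /* index register 1: set by tsx1         */
    int x7;         /* index register 7: the stack pointer   */
    int mem[128];   /* toy memory: call words and the stack  */
};

/* Effect of "tsx1 sub,*" followed by the entry sequence of sub.
   call_word is the address of the "zero s,n" word after the tsx1. */
void enter(struct machine *m, int call_word)
{
    assert(m != NULL);
    m->x1 = call_word;          /* tsx1: x1 points at "zero s,n"   */
    m->x7 += m->mem[m->x1];     /* adx7 0,1: advance stack by s    */
    m->mem[m->x7] = m->x1;      /* stx1 0,7: save return address   */
}

/* Effect of the return sequence; returns the address resumed at. */
int leave(struct machine *m)
{
    assert(m != NULL);
    m->x1 = m->mem[m->x7];      /* ldx1 0,7: restore return address */
    m->x7 -= m->mem[m->x1];     /* sbx7 0,1: restore stack pointer  */
    return m->x1 + 1;           /* tra 1,1: resume after "zero s,n" */
}
```

With the call word at address 10 holding s = 6 and register 7 at 41, enter moves register 7 to 47, and leave restores it to 41 and resumes at address 11.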


Arguments are passed by value, and never copied back to the calling routine. If there is one argument, it is
passed in the A register; if two, they are passed in the A and Q registers. If there are more than two, the
third and later arguments are computed and stored on the stack so that they become addresses 3,7, 4,7, and
so on after the stack frame is advanced. B programs treat their arguments exactly as other automatic
variables; it is necessary, thus, to move the first two arguments from the registers onto the stack. This is done
by a 

sta 1,7 


or 


staq 1,7 

immediately following the entry sequence given above. The instruction staq demands that its effective
address be even. Thus, we have two restrictions designed to assure that index 7 is always odd:

a. Index 7 is initially odd. 

b. In calls, the stack is always bumped by an even amount. 
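
The two restrictions form a simple invariant: an odd value plus an even bump stays odd, so the effective address 1,7 used by staq is always even. A small C sketch, with hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>

/* "staq 1,7" is acceptable when its effective address x7+1 is even. */
bool staq_ok(int x7)
{
    return (x7 + 1) % 2 == 0;
}

/* Advance the stack pointer for a call; restriction (b) requires
   the amount s to be even. */
int bump(int x7, int s)
{
    assert(s >= 0 && s % 2 == 0);
    return x7 + s;
}
```

Starting from any odd value, repeated calls to bump keep staq_ok true.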

Functions return a value in the Q register. The indicators are not assumed set on return, since the .10000 
sequence destroys them. 

From the above discussion, the reader can write general B-callable GMAP subroutines of arbitrary
complexity. If the subroutine has at most two arguments, however, the stack need not be changed, since the A
and Q registers contain the first two arguments. The general rules for this situation are:

a. The A and Q registers contain the first two arguments. 

b. Index registers 2 through 7 must be restored to their previous values upon return. 

c. Return is to location 1,1 

As an example, the following rather useless subroutine returns the sum of its two arguments: 


$      gmap
       symdef add2
add2   zero   *+1
       sta    temp      store first argument
       adq    temp      add to second
       tra    1,1       return with result in Q
temp   zero
       end

Simple GMAP functions of this type are particularly useful as interfaces to MH-TSS system calls
("derails").

12.0 Nasties

This section describes the uglier features of B on MH-TSS. 

1. The compiler makes sense of certain expressions with operators in ambiguous cases (e.g. a+++b) but 
not others even in unambiguous cases (e.g. a+++++b). 

2. The GMAP assembler flags instructions of the form: 

staq 1,7

with "A" flags; in fact, these instructions always work properly, and are no cause for alarm. 

3. External names in GECOS may only be six characters in length, and are in bcd so that case
distinctions are lost. Thus, the B names aaaaaa, AAAAAA, and aaaaaaxx all refer to the same external
function.

4. The use of the operator & in conditional statements has been optimized to allow for conditional
transfers and much more efficient object code; unfortunately, there is a conflict between this use and the
use of this operator in its more usual sense as a bitwise logical operator. Thus, the statement:

if ( a&077 )goto label;

is ambiguous; on the one hand, it could mean "if the last two octal digits of a are not 00, go to label"
(& used as a bitwise operator), while on the other hand it could mean "if both a and 077 are nonzero,
go to label" (& as a logical operator). This ambiguity is broken by always assuming the logical
operator in a case where a truth value is expected (if, while, and conditional expression). To force the
bitwise interpretation in the above example, one must write

if ( (a&077) != 0 )goto label; 

which is in fact optimized well by the code generator. 

5. Missing subroutines, perhaps caused by misspelling, cause the GECOS loader to insert a MME 
GEBORT at the references. This, when executed in MH-TSS, gives the message "Illegal opcode". 
Check the load map to make sure. 
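
Two of the points above (external names in item 3, and the two readings of & in item 4) can be made concrete in C. This is a sketch: the function names are hypothetical, and folding to upper case merely stands in for the loss of case distinctions in bcd.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Reduce a B name to what survives as an external name: at most
   six characters, with case distinctions removed. */
void extern_key(const char *name, char key[7])
{
    assert(name != NULL && key != NULL);
    int i = 0;
    for (; i < 6 && name[i] != '\0'; i++)
        key[i] = (char)toupper((unsigned char)name[i]);
    key[i] = '\0';
}

/* True if two B names refer to the same external function. */
bool same_extern(const char *x, const char *y)
{
    char kx[7], ky[7];
    extern_key(x, kx);
    extern_key(y, ky);
    return strcmp(kx, ky) == 0;
}

/* The two readings of "a & b" where a truth value is expected. */
bool as_logical(long a, long b) { return a != 0 && b != 0; }
bool as_bitwise(long a, long b) { return (a & b) != 0; }
```

With a equal to 0100 (octal), the logical reading of a&077 is true while the bitwise reading is false, which is why the explicit comparison with 0 is needed to force the bitwise sense.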

13.0 Diagnostics 

Diagnostics consist of two letters, an optional name, and a source line number. Due to the free format of 
the source, the source line number might be slightly too high. The following is a list of the diagnostics. 

[Table omitted; same as in Kernighan tutorial — DMR] 

14.0 Future Plans 

There are a number of desirable features missing from B in MH-TSS. These include the ability to use full 
cat/file descriptions when accessing a file, the ability to read and write BCD, random, and binary files, and 
the ability to run in batch. A number of these features could be added at once if an I/O interface to GFRC 
were written; this seems like the most likely future extension to the I/O library. It is almost certain that 
getchar and putchar will remain functionally unchanged in this case. Users are urged to gather their file 
accessing and error handling routines in a small number of clearly marked spots, both as a matter of good 
programming practice and to facilitate any future changes that might be needed. 

B works tolerably well on the H6070, which is a word addressable machine; when using a byte addressable 
machine such as the IBM 360/370 models or the PDP-11, B seems less attractive. A successor language, C, 
is being developed which allows most of the advantages of B on byte addressable machines, as well as a 
structure capability. While the case for C on the H6070 is not as strong as it is on byte addressable 
machines, the structure and character manipulation capabilities make it likely that C will eventually appear 
on the H6070. 

A final area where progress is both possible and desirable is in the area of debugging aids. It would be 
fairly easy to add a symbol table to the compiler output, provided that programs were written to access 
these tables.